Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normative: Add machinery to track the incumbent settings object in HTML for Job callbacks #2086

Merged
merged 1 commit into from
Aug 11, 2020

Conversation

syg
Copy link
Contributor

@syg syg commented Jul 9, 2020

In HTML, JS functions that are called asynchronously as part of jobs
(like Promise microtasks) are generally associated with a piece of state
called an incumbent settings object. The incumbent is stored at the time
when the callback is first passed to the API that is responsible for
scheduling the job (e.g. Promise.then), and restored when the callback
is called.

For symmetry, all WebIDL callbacks do this:
https://heycam.github.io/webidl/#idl-callback-interface

This PR adds the necessary host hook machinery to let HTML store and
restore the incumbent settings object. The default behavior is nop.

cc @domenic @annevk @codehag

@syg syg added normative change Affects behavior required to correctly evaluate some ECMAScript source text layering affects the public spec interface and may require updates to integrating specs labels Jul 9, 2020
@syg
Copy link
Contributor Author

syg commented Jul 9, 2020

Note that I deliberately did not add the "needs tests" tag, because this is not a thing for ecma262 to test.

Copy link
Member

@ljharb ljharb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand what improvements this will allow in the HTML spec?

spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
Copy link
Member

@TimothyGu TimothyGu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to see an HTML pull request that makes use of these instruments.

spec.html Outdated
[[Callback]]
</td>
<td>
A callable object, or *undefined*
Copy link
Member

@TimothyGu TimothyGu Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this could be anything other than a callable object, in fact.

And while at it, I'm not sure what value there is to be had to say "callable object" rather than just "function"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was to call out revocable Proxies.

spec.html Outdated Show resolved Hide resolved
@syg
Copy link
Contributor Author

syg commented Jul 9, 2020

@TimothyGu @domenic One thing I'm not clear on: currently for Promises, HTML does prepare to run a callback in step 6.3 in its definition of HostEnqueuePromiseJob. That is, it does it as part of running the job, before it gets to invoking the function.

To be more in line with how all WebIDL-based callbacks work, that step would be moved to the actual invocation of the function. That seems to have an observable difference because of step 6.4. If there's an active script context, if prepare to run a script is performed after 6.4, it'd increment the skip-when-determining counter. What effect does that change have? Is that desirable?

There's no interop for this case at the moment so we can make something up, I think.

Edit: Tangential question: shouldn't WebIDL also propagate the active script or module, for the same reason that Promise microtasks do, for dynamic import()?

@syg
Copy link
Contributor Author

syg commented Jul 9, 2020

Can you help me understand what improvements this will allow in the HTML spec?

@ljharb HTML has several kinds of "settings objects": https://html.spec.whatwg.org/multipage/webappapis.html#realms-settings-objects-global-objects, which ecma262 spec authors can mostly think of as 1:1 with a Realm, but HTML has other state in there in addition to the Realm.

One of them is called the "incumbent settings object", which is a legacy thing that is kinda complicated but nevertheless is used everywhere, because all WebIDL callbacks support it.

The idea is that the incumbent corresponds to the Realm that was responsible for some code having been run. The tricky thing is: for async callbacks, the Realm that was responsible for some code having been run is the Realm at the point whatever API was called that caused the callback to eventually be scheduled and run. Concretely, for Promises, this is at the point of, e.g., calling Promise#then. Note that that point can be a different point in time from when the microtask is actually scheduled (because Promise reactions can be triggered later) and from when the microtask actually runs! So, to correctly propagate this incumbent settings object, it needs to be captured and associated with the callback at the point when Promise#then is called. To do so, ecma262 needs additional host hooks, since it's a different point in time from when the Promise job is enqueued.

Currently, Chrome and Safari do not propagate the incumbent settings object for Promises because we never specified it. Firefox does. All browsers propagate the incumbent for APIs that use WebIDL, because the bindgen takes care of it. This change is to enable our async callbacks that must get scheduled by the host to be consistent with the rest of the web platform.

It's very corner-casey, but still a point of divergence.

Does that make sense?

spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
@syg
Copy link
Contributor Author

syg commented Jul 9, 2020

Thinking out loud here, after I wrote out #2086 (comment), it made me realize perhaps the approach in this PR is wrong, and that the incumbent should be associated per Job instead of per Job callback, since the existing hook points to perform custom steps are before and after invocation of the Job itself.

That should address @TimothyGu's question in #2086 (comment) as well, because then there's no need for any host hooks for prepare to run callback or clean up after running callback.

There's no reason to support the expressivity this PR currently offers, of letting a Job have multiple callbacks with different incumbents.

@domenic
Copy link
Member

domenic commented Jul 9, 2020

Interesting... thanks for working through all this.

Tangential question: shouldn't WebIDL also propagate the active script or module, for the same reason that Promise microtasks do, for dynamic import()?

I think that would be a good idea. In particular, HostEnqueuePromiseJob talks about making Promise.resolve('import(`./example.mjs`)').then(eval) propagate forward the active script correctly. For the counterpart case of setTimeout(eval, 0, 'import(`./example.mjs`)'), we have the same problem. Great catch.

To be more in line with how all WebIDL-based callbacks work, that step would be moved to the actual invocation of the function. That seems to have an observable difference because of step 6.4. If there's an active script context, if prepare to run a script is performed after 6.4, it'd increment the skip-when-determining counter. What effect does that change have? Is that desirable?

In general I think we should align with Web IDL-based callbacks. I am pretty sure that the current ordering does not deliberately depart, but instead we just didn't think this through.

However, I think the alignment is in favor of the current HostEnqueuePromiseJob sequence, i.e. prepare to run script / prepare to run a callback / propagate active script / call the function, instead of in favor of the current Web IDL sequence (prepare to run script / prepare to run a callback / call the function) or the modified version you are suggesting (propagate active script / prepare to run script / prepare to run a callback / call the function).

Details with an example In particular, let's consider the final example from the bottom of https://html.spec.whatwg.org/multipage/webappapis.html#incumbent:
<!-- a.html -->
<!DOCTYPE html>
<button>click me</button>
<iframe></iframe>
<script>
const bound = frames[0].location.assign.bind(frames[0].location, "https://example.com/");
document.querySelector("button").addEventListener("click", bound);
</script>

<!-- b.html -->
<!DOCTYPE html>
<iframe src="a.html"></iframe>
<script>
  const iframe = document.querySelector("iframe");
  iframe.onload = function onLoad() {
    iframe.contentWindow.document.querySelector("button").click();
  };
</script>

where the incumbent settings object should be that of a.html. When the click event is dispatched, we will:

  1. Prepare to run script: push b.html's realm execution context onto the stack
  2. Prepare to call a callback:
  • push a.html's settings object onto the backup incumbent settings object stack
  • Increment the topmost script-having execution context's skip-when-determining-incumbent counter: that of the onLoad function
  1. Propagate the active script: push the <script>'s execution context onto the execution context stack

If we instead reversed the order of (2) and (3), then we would increment the <script>'s execution context's skip-when-determining-incumbent counter, but fail to increment onLoad's, which breaks the incumbent-determination logic.

I will send a tentative Web IDL PR with these semantics.

As a side note, it would be ideal if we could somehow reuse the Web IDL machinery in HostEnqueuePromiseJob, instead of duplicating it, so that they always stay in sync. The most straightforward way I can think of to do this would be to allow "converting" from a raw JS function into a Web IDL Function (i.e. callback type instance), and then instead of JS-Call()ing that function, we can later Web IDL-invoke it. That might also benefit Weak References or other future cases. But I am not sure how well that would layer with ES...

domenic added a commit to whatwg/webidl that referenced this pull request Jul 9, 2020
@syg
Copy link
Contributor Author

syg commented Jul 9, 2020

As a side note, it would be ideal if we could somehow reuse the Web IDL machinery in HostEnqueuePromiseJob, instead of duplicating it, so that they always stay in sync. The most straightforward way I can think of to do this would be to allow "converting" from a raw JS function into a Web IDL Function (i.e. callback type instance), and then instead of JS-Call()ing that function, we can later Web IDL-invoke it. That might also benefit Weak References or other future cases. But I am not sure how well that would layer with ES...

There's a mismatch here that'll make that hard, because WebIDL AFAICT does not distinguish a Job from a function that's called as part of a Job. ecma262 has a finer grained notion of Job-vs-callback function. For Promise reactions, for example, we schedule a Job with a possibly undefined handler function. I don't think we should be making all Jobs actual JS functions instead of abstract closures, which they are now. Observably I don't think it matters, but it's less error prone to not require synthetic JS functions.

For alignment with the WebIDL callbacks, I think the way to go is to align on the invariant that there's a propagated incumbent per Job, not per JS function. That is, I find it exceedingly weird if a single Job could call different callbacks with different incumbents. Then, have WebIDL somehow override the necessary points in ecma262, and have HTML normatively reference WebIDL for Promises / WeakRefs. I'll queue that work as a followup.

However, I think the alignment is in favor of the current HostEnqueuePromiseJob sequence

Thanks for thinking this through. I'll rework this PR to track incumbents per-Job and to preserve the status quo sequencing.

@domenic
Copy link
Member

domenic commented Jul 9, 2020

There's a mismatch here that'll make that hard

That's understandable. I know this is already pretty heroic host-integration work, so I'm sympathetic to keeping the burden on 262 light. We can maintain duplicated logic in HostEnqueuePromiseJob plus Web IDL, or figure out some refactoring on our side to make it better there. I was just mentioning it as a nice-to-have.

(In particular, your "have HTML normatively reference WebIDL for Promises / WeakRefs" sounds like this kind of refactoring. I also noticed that Web IDL currently has its own duplication problems, as you can see from the three modification sites in whatwg/webidl#902. I'm happy to help with this once the 262 infrastructure is in place, although I appreciate your volunteering.)

For alignment with the WebIDL callbacks, I think the way to go is to align on the invariant that there's a propagated incumbent per Job, not per JS function. That is, I find it exceedingly weird if a single Job could call different callbacks with different incumbents.

Although currently I don't think this is the case (although I'm not very familiar with how weak references use jobs), there's nothing about this that would seem exceedingly weird to me. In particular, it's easy to imagine a mechanism that allows multiple call sites (with different incumbents) to "batch" up a series of function calls into a single job. If you're saying that's likely to never be introduced in 262, and instead you'll do at most one function per job, then that works too. But multiple functions per job seems within the realm of future possibility.

@syg
Copy link
Contributor Author

syg commented Jul 9, 2020

Although currently I don't think this is the case (although I'm not very familiar with how weak references use jobs), there's nothing about this that would seem exceedingly weird to me. In particular, it's easy to imagine a mechanism that allows multiple call sites (with different incumbents) to "batch" up a series of function calls into a single job. If you're saying that's likely to never be introduced in 262, and instead you'll do at most one function per job, then that works too. But multiple functions per job seems within the realm of future possibility.

No, I wasn't saying something as strong as "likely to never be introduced in 262", was more of my personal opinion. The batching case sounds plausible to me. If that's something that might happen in the future, then the per-function approach in the current PR is the desirable one. The downside is we'd have to rejigger the steps in the current HTML definition of HostEnqueuePromiseJob.

I guess I'll write up both and compare the complexity.

@syg
Copy link
Contributor Author

syg commented Jul 9, 2020

Continuing with the per-function approach, which with two host hooks, HostMakeJobCallback and HostCallJobCallback, is fairly clean imo.

I'll draft an HTML PR later today to use these.

Copy link
Member

@ljharb ljharb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems pretty straightforward now, thanks. Looking forward to seeing the HTML PR.

spec.html Outdated
@@ -7751,8 +7751,8 @@ <h1>HostCallJobCallback ( _jobCallback_, _V_ [ , _argumentsList_ ] )</h1>
<p>HostCallJobCallback is a host-defined abstract operation that argument _jobCallback_ (a JobCallback Record), _V_ (an ECMAScript language value) and optional argument _argumentsList_ (a List of ECMAScript language values).</p>
<p>The implementation of HostCallJobCallback must conform to the following requirements:</p>
<ul>
<li>If _argumentsList_ is not present, it must always return the result of Call(_jobCallback_.[[Callback]], _V_).</li>
<li>If _argumentsList_ is present, it must always return the result of Call(_jobCallback_.[[Callback]], _V_, _argumentsList_).</li>
<li>If _argumentsList_ is not present, it must always perform and return the result of Call(_jobCallback_.[[Callback]], _V_).</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tfw not allowed to use ai to predict the result of Call() instead of performing it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's unobservable that you've skipped it then you certainly could avoid performing it :-p

spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
syg added a commit to syg/html that referenced this pull request Jul 10, 2020
This depends on the host hooks that ecma262 is adding in [1].

[1] tc39/ecma262#2086
spec.html Outdated Show resolved Hide resolved
@syg syg force-pushed the incumbent-host-hooks branch from 4606508 to 749e5db Compare July 21, 2020 23:53
@syg
Copy link
Contributor Author

syg commented Jul 22, 2020

@erights, @bmeck, @FUDCo, @codehag and I had a call to try to address @erights's and @bmeck's concerns to this PR when I first presented.

First, the changes that should address their concerns:

  1. HostMakeJobCallback and HostCallJobCallback are restricted to web browser hosts. Hosts that are not web browsers must use the default implementations.

  2. Add an additional note that points out the implication of the requirement of HostCallJobCallback, that hosts are not allowed to override the [[Call]] behavior of callable objects that ecma262 specifies, which in turn means that host cannot use these hooks to e.g. make ECMAScript code execution dynamically scoped since the [[Call]] algorithms explicitly describe the execution contexts to push.

Second, to record their concerns and clear up some misunderstandings for posterity.

@erights's primary concern was that these host hooks would allow hosts to execute JS code in a dynamically scoped way. While true for host-defined exotic callable objects, the behavior of function calls for callable objects specified in ecma262 was never hookable. Further, @erights considers the extra expressivity allowed by these host hooks in general undesirable. As a compromise, these host hooks are restricted to web browser hosts.

@bmeck had similar concerns in considering the extra expressivity for allowing incumbent-like that's allowed here in general undesirable, specifically for Node.

There was also general confusion where I was explaining that built-in function objects as specified by ecma262 do not have [[Realms]]. This was incorrect, which resulted in a lot of misunderstanding. CreateBuiltinFunction does capture a [[Realm]]. Mea culpa.

@ljharb ljharb added the has consensus This has committee consensus. label Jul 22, 2020
spec.html Outdated Show resolved Hide resolved
spec.html Show resolved Hide resolved
@ljharb ljharb requested a review from michaelficarra July 30, 2020 06:13
@ljharb ljharb requested review from bakkot and a team July 30, 2020 06:13
@ljharb ljharb self-assigned this Jul 30, 2020
@syg syg force-pushed the incumbent-host-hooks branch from 16a2c0b to e08a3ee Compare August 6, 2020 18:12
@syg
Copy link
Contributor Author

syg commented Aug 6, 2020

Rebased

spec.html Outdated Show resolved Hide resolved
spec.html Show resolved Hide resolved
spec.html Show resolved Hide resolved
spec.html Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
@syg syg force-pushed the incumbent-host-hooks branch from d388f93 to bc8e6aa Compare August 11, 2020 18:54
@syg
Copy link
Contributor Author

syg commented Aug 11, 2020

Rebased

@ljharb ljharb requested a review from bakkot August 11, 2020 20:28
@ljharb ljharb changed the title Add machinery to track the incumbent settings object in HTML for Job callbacks Normative: Add machinery to track the incumbent settings object in HTML for Job callbacks Aug 11, 2020
…ML for Job callbacks (tc39#2086)

In HTML, JS functions that are called asynchronously as part of jobs
(like Promise microtasks) are generally associated with a piece of state
called an incumbent settings object. The incumbent is stored at the time
when the callback is first passed to the API that is responsible for
scheduling the job (e.g. Promise.then), and restored when the callback
is called.

For symmetry, all WebIDL callbacks do this:
https://heycam.github.io/webidl/#idl-callback-interface

This PR adds the necessary host hook machinery to let HTML store and
restore the incumbent settings object. The default behavior is nop.
@ljharb ljharb force-pushed the incumbent-host-hooks branch from bc8e6aa to 4069869 Compare August 11, 2020 22:35
@ljharb ljharb merged commit 4069869 into tc39:master Aug 11, 2020
h2oche pushed a commit to h2oche/ecma262 that referenced this pull request Aug 16, 2020
…ML for Job callbacks (tc39#2086)

In HTML, JS functions that are called asynchronously as part of jobs
(like Promise microtasks) are generally associated with a piece of state
called an incumbent settings object. The incumbent is stored at the time
when the callback is first passed to the API that is responsible for
scheduling the job (e.g. Promise.then), and restored when the callback
is called.

For symmetry, all WebIDL callbacks do this:
https://heycam.github.io/webidl/#idl-callback-interface

This PR adds the necessary host hook machinery to let HTML store and
restore the incumbent settings object. The default behavior is nop.
@ByteEater-pl
Copy link

From the notes:

The prose now says that hosts that are not browser must follow the default behavior. I also added a note that any host cannot override behavior that ECMA262 specifies.

This bifurcates the language into a dialect for browsers and for elsewhere. It also blesses browsers with some extensibility denied to all other implementations, whatever their concerns may be. If there are any limitations or invariants those operations must obey, I believe it's better to spec them and allow all implementations to do whatever they need in a compliant way (possibly also specifying a sane default that MAY or even SHOULD be used).

@bakkot
Copy link
Contributor

bakkot commented Aug 16, 2020

This bifurcates the language into a dialect for browsers and for elsewhere. It also blesses browsers with some extensibility denied to all other implementations, whatever their concerns may be.

This isn't the first time that's happened: the [[IsHTMLDDA]] internal slot is forbidden to non-browser implementations.

In general web browsers have a lot of quirks which they are effectively stuck with. Just because some extensibility is necessary to specify existing browser quirks doesn't mean it's a good idea for any implementation which is not a browser to make use of that extensibility.

@bmeck
Copy link
Member

bmeck commented Aug 16, 2020

@ByteEater-pl I generally agree with your sentiments. However, in this case the issue is that using these hooks as the browser does introduces dynamic scoping into the language which the language builtins do protect against intentionally and various members do not want in the core language itself. To be more concrete on the issue we can describe it in what happens in HTML:

  • Given 2 realms A and B
  • Have a binding in B that can enqueue a Job onto A (e.g. HTML's postMessage)
  • Have a binding in A that uses the incumbent realm to do something (such as HTML's promise Job queue and an evaluator).
  • B enqueues a Job on A that eventually runs one of A's functions in A, no direct reference to B.
  • A's code in A's realm queued on A's job queue will enqueue a Job in B and use B's host hooks once it evaluates for things like import.

The introduction of dynamic scoping means the spec would then need to compensate for it in every possible future language design decision. As it is, we are fixing layering issues with the HTML specification but not trying to adopt the design choices into JS the language, as @bakkot mentions. I would note that this change is only visible and applicable when a host writes very specific hooks as JS implementations of builtins always ensure they have the realm in which they were created tied to them as seen by GetFunctionRealm

@ByteEater-pl
Copy link

I'm aware of [[IsHTMLDDA]] but it's clearly a legacy thing, not recommended and with an easy workaround (which can even be applied automatically by tooling to existing code) to obtain something equivalent not relying on document.all being falsy. I'm not familiar enough with browser quirks to determine if that's also the case here. Is it?

@bmeck
Copy link
Member

bmeck commented Aug 16, 2020

@ByteEater-pl to my knowledge it is not possible using WASM or JS to emulate this behavior, it must be done using some very specific browser APIs, of which there are very few.

@ByteEater-pl
Copy link

@bmeck, it's certainly good to know, but I'm not sure how it relates to my concern. So let me rephrase. Is the proposed behaviour required only for some legacy code to work (like [[IsHTMLDDA]]) but you can easily do everything (access full functionality) without relying on it and this is indeed recommended by the HTML5 folks? Or are there actually non-legacy use cases? If the former, is there a universal way (possibly one that could be applied automatically by tooling on user request) to transform code into equivalent one working even with non-browser implementations (assuming the existence of equivalent host objects in them)?

@bmeck
Copy link
Member

bmeck commented Aug 18, 2020

@ByteEater-pl to my knowledge this is only about covering a case where HTML was unable to properly determine the realm to run code in. The APIs are not deprecated but you cannot replicate the behavior via WASM or JS itself. This cannot be automatically detected or transformed.

@syg
Copy link
Contributor Author

syg commented Aug 18, 2020

This is primarily a layering thing as folks have said. I'm still weakly for not making this a browser-specific extension point, but I don't feel strongly, certainly not relative to other delegates.

@bmeck To be more charitable to HTML, it doesn't really use the incumbent Realm concept to execute JS code, since if there were JS code to execute, they'd already have a Realm. Instead this is for native methods that can end up being queued as tasks. I still quarrel with the characterization that this is full-blown dynamic scoping, though it's fair enough that it means tasks in HTML, as a concept that may execute more than just JS code, don't have full lexical scoping.

@ByteEater-pl Do you have a concrete use case for these extension points for non-browser implementations?

@bmeck
Copy link
Member

bmeck commented Aug 18, 2020

I still quarrel with the characterization that this is full-blown dynamic scoping, though it's fair enough that it means tasks in HTML, as a concept that may execute more than just JS code, don't have full lexical scoping.

I'd prefer we avoid this detailed argument here.

@ByteEater-pl
Copy link

@ByteEater-pl Do you have a concrete use case for these extension points for non-browser implementations?

No. I guess that bridge can be crossed when and if needs be. Though maybe a note is warranted to clarify it.

To be more charitable to HTML, it doesn't really use the incumbent Realm concept to execute JS code, since if there were JS code to execute, they'd already have a Realm. Instead this is for native methods that can end up being queued as tasks.

I'm still trying to understand why such an exceptional power for browser implementations is seen as necessary for this. Aren't native methods hosts objects, thus allowed to do virtually anything (though the requirement to preserve essential object invariants comes to mind), including creating realms, enqueueing jobs and rearranging existing ones?

@bakkot
Copy link
Contributor

bakkot commented Aug 24, 2020

Aren't native methods hosts objects, thus allowed to do virtually anything

Native methods are indeed very powerful, but the problem is that the information they need to do the "right" thing in this case is not available to them without this change, because that information is only available at an earlier point (in this case, at the point at which Promise.then ran). See @syg's comment above. It might also help to take a look at the slides from when he presented this PR.

syg added a commit to syg/html that referenced this pull request Dec 16, 2020
This depends on the host hooks that ecma262 is adding in [1].

[1] tc39/ecma262#2086
sideshowbarker pushed a commit to syg/html that referenced this pull request Jan 31, 2021
This depends on the host hooks that ecma262 is adding in [1].

[1] tc39/ecma262#2086
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
has consensus This has committee consensus. layering affects the public spec interface and may require updates to integrating specs normative change Affects behavior required to correctly evaluate some ECMAScript source text
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants